AIBase

AI News
Challenging Conventions: A Breakthrough Transformer Architecture Without Normalization Layers

In deep learning, normalization layers have long been considered an indispensable component of modern neural networks. Recent research led by Meta FAIR research scientist Zhuang Liu, titled "Transformer without Normalization Layers", has attracted significant attention. It introduces a new technique called Dynamic Tanh (DyT) and demonstrates that Transformer architectures remain effective even without traditional normalization layers.
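As described in the paper, DyT replaces a normalization layer with a simple elementwise operation: a tanh squashed by a learnable scalar α, followed by a per-feature scale γ and shift β (analogous to LayerNorm's affine parameters). The snippet below is a minimal pure-Python sketch of that operation; the function and parameter names are illustrative, not taken from any released code.

```python
import math

def dyt(x, alpha, gamma, beta):
    """Dynamic Tanh (DyT): elementwise gamma * tanh(alpha * x) + beta.

    alpha is a learnable scalar; gamma and beta are learnable per-feature
    scale and shift vectors. Unlike LayerNorm, no mean/variance statistics
    are computed over the features.
    """
    return [g * math.tanh(alpha * xi) + b for xi, g, b in zip(x, gamma, beta)]

# Toy usage: a 4-dimensional feature vector with alpha = 0.5 and an
# identity affine transform (gamma = 1, beta = 0).
features = [-2.0, -0.5, 0.5, 2.0]
out = dyt(features, alpha=0.5, gamma=[1.0] * 4, beta=[0.0] * 4)
```

Because tanh is bounded, the output stays in (-1, 1) before the affine transform, which is one intuition the paper offers for why DyT can stand in for normalization.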

Models

Model                                                     Provider   Input ($/M tokens)   Output ($/M tokens)   Context Length (K)
GLM-4.5-X                                                 Chatglm    $8                   $16                   128
Qwen3-0.6B                                                Alibaba    $0.3                 -                     32
Tencent Hunyuan Image Generation (Multi-round Dialogue)   Tencent    -                    -                     -
Baichuan-7B                                               Baichuan   -                    -                     4

© 2026 AIBase